PIAT: Physics Informed Adversarial Training for Solving Partial Differential Equations
In this paper, we propose the physics informed adversarial training (PIAT) of
neural networks for solving nonlinear differential equations (NDEs). It is
well-known that the standard training of neural networks results in non-smooth
functions. Adversarial training (AT) is an established defense mechanism
against adversarial attacks, which could also help in making the solution
smooth. AT include augmenting the training mini-batch with a perturbation that
makes the network output mismatch the desired output adversarially. Unlike
formal AT, which relies only on the training data, here we encode the governing
physical laws in the form of nonlinear differential equations using automatic
differentiation in the adversarial network architecture. We compare PIAT with
PINN to demonstrate the effectiveness of our method in solving NDEs in up to 10
dimensions. Moreover, we apply weight decay and Gaussian smoothing to
illustrate the advantages of PIAT. The code repository is available at
https://github.com/rohban-lab/PIAT
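The core mechanism, crafting an adversarial perturbation of the collocation points that maximizes the PDE residual, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the toy network, the PDE (u'' = -sin x), the finite differences standing in for automatic differentiation, and the eps budget are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny random MLP standing in for the solution network u(x).
W1, b1 = rng.normal(size=(1, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def u(x):
    # x: (n, 1) collocation points -> (n, 1) candidate solution values.
    return np.tanh(x @ W1 + b1) @ W2 + b2

def residual(x, h=1e-3):
    # Central finite differences stand in for automatic differentiation:
    # residual of the toy PDE u''(x) = -sin(x).
    d2u = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
    return d2u + np.sin(x)

def fgsm_points(x, eps=0.05, h=1e-4):
    # FGSM-style step on the collocation points: move each point in the
    # direction that increases its squared PDE residual (the adversarial
    # perturbation the abstract describes). eps is illustrative.
    base = residual(x) ** 2
    grad = (residual(x + h) ** 2 - base) / h
    return x + eps * np.sign(grad)

x = np.linspace(0.0, np.pi, 32).reshape(-1, 1)
x_adv = fgsm_points(x)  # train on these perturbed points instead of x
```

Training would then minimize the residual loss on `x_adv` rather than on `x`, exposing the network to the points where it currently violates the physics most.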
Incorporating Betweenness Centrality in Compressive Sensing for Congestion Detection
This paper presents a new Compressive Sensing (CS) scheme for detecting
network congested links. We focus on decreasing the required number of
measurements to detect all congested links in the context of network
tomography. We have expanded the LASSO objective function by adding a new term
corresponding to the prior knowledge based on the relationship between the
congested links and the corresponding link Betweenness Centrality (BC). The
accuracy of the proposed model is verified by simulations on two real datasets.
The results demonstrate that our model outperforms the state-of-the-art
CS-based method, with significant improvements in F-score.
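One plausible way to encode such a prior is to reweight the LASSO's per-link sparsity penalty by betweenness centrality and solve the resulting problem with proximal gradient descent (ISTA). The sketch below is an illustration of that general idea, not the paper's exact objective; the weighting formula and all parameters are assumptions.

```python
import numpy as np

def weighted_lasso_ista(A, y, bc, lam=0.1, mu=0.5, lr=0.01, iters=1000):
    """ISTA for min_x 0.5*||Ax - y||^2 + sum_i w_i*|x_i|, where A is the
    routing matrix, y the path measurements, and bc the per-link
    betweenness centrality. High-BC links get a smaller penalty w_i,
    a hypothetical encoding of "high-BC links congest more often"."""
    w = lam * (1.0 - mu * bc / (bc.max() + 1e-12))
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - y)                 # gradient of the quadratic term
        z = x - lr * g
        x = np.sign(z) * np.maximum(np.abs(z) - lr * w, 0.0)  # soft-threshold
    return x

# Synthetic example: 12 path measurements over 6 links, one congested
# high-BC link (link 0).
rng = np.random.default_rng(1)
A = rng.normal(size=(12, 6))
bc = np.array([0.9, 0.8, 0.1, 0.1, 0.05, 0.05])
x_true = np.zeros(6); x_true[0] = 2.0
y = A @ x_true
x_hat = weighted_lasso_ista(A, y, bc)
```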
An Impossibility Result for High Dimensional Supervised Learning
We study high-dimensional asymptotic performance limits of binary supervised
classification problems where the class conditional densities are Gaussian with
unknown means and covariances and the number of signal dimensions scales faster
than the number of labeled training samples. We show that the Bayes error,
namely the minimum attainable error probability with complete distributional
knowledge and equally likely classes, can be arbitrarily close to zero and yet
the limiting minimax error probability of every supervised learning algorithm
is no better than a random coin toss. In contrast to related studies where the
classification difficulty (Bayes error) is made to vanish, we hold it constant
when taking high-dimensional limits. In contrast to VC-dimension based minimax
lower bounds that consider the worst case error probability over all
distributions that have a fixed Bayes error, our worst case is over the family
of Gaussian distributions with constant Bayes error. We also show that a
nontrivial asymptotic minimax error probability can only be attained for
parametric subsets of zero measure (in a suitable measure space). These results
expose the fundamental importance of prior knowledge and suggest that unless we
impose strong structural constraints, such as sparsity, on the parametric
space, supervised learning may be ineffective in high dimensional small sample
settings.
Comment: This paper was submitted to the IEEE Information Theory Workshop (ITW) 2013 on April 23, 201
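The contrast at the heart of the result can be paraphrased in symbols. The notation below is an illustrative restatement, not the paper's own:

```latex
% Bayes error held constant at \varepsilon while dimension d outgrows sample size n:
\inf_{\delta}\, P_{\theta}\!\left(\delta(X) \neq Y\right) \le \varepsilon
\quad \text{for every } \theta \in \Theta_d(\varepsilon),
\qquad \text{yet} \qquad
\lim_{d/n \to \infty}\;
\inf_{\hat{\delta}_n}\, \sup_{\theta \in \Theta_d(\varepsilon)}
P_{\theta}\!\big(\hat{\delta}_n(X) \neq Y\big) = \tfrac{1}{2},
```

where $\Theta_d(\varepsilon)$ is the family of $d$-dimensional Gaussian class-conditional pairs with Bayes error $\varepsilon$, and $\hat{\delta}_n$ ranges over classifiers learned from $n$ labeled samples. The first statement says an oracle with full distributional knowledge does well; the second says no learner can beat a coin toss uniformly over the family.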
Novel Pipeline for Diagnosing Acute Lymphoblastic Leukemia Sensitive to Related Biomarkers
Acute Lymphoblastic Leukemia (ALL) is one of the most common types of
childhood blood cancer. The quick start of the treatment process is critical to
saving the patient's life, and for this reason, early diagnosis of this disease
is essential. Examining the blood smear images of these patients is one of the
methods used by expert doctors to diagnose this disease. Deep learning-based
methods have numerous applications in medical fields, as they have
significantly advanced in recent years. ALL diagnosis is not an exception in
this field, and several machine learning-based methods for this problem have
been proposed. In previous methods, high diagnostic accuracy was reported, but
our work showed that this alone is not sufficient, as such models can take
shortcuts and fail to make meaningful decisions. This issue arises due to
the small size of medical training datasets. To address this, we constrained
our model to follow a pipeline inspired by experts' work. We also demonstrated
that, since a judgement based on only one image is insufficient, redefining the
problem as a multiple-instance learning problem is necessary for achieving a
practical result. Our model is the first to provide a solution to this problem
in a multiple-instance learning setup. We introduced a novel pipeline for
diagnosing ALL that approximates the process used by hematologists, is
sensitive to disease biomarkers, and achieves an accuracy of 96.15%, an
F1-score of 94.24%, a sensitivity of 97.56%, and a specificity of 90.91% on ALL
IDB 1. Our method was further evaluated on an out-of-distribution dataset,
a challenging test on which it achieved acceptable performance. Notably, our
model was trained on a relatively small dataset, highlighting the potential for
our approach to be applied to other medical datasets with limited data
availability.
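The multiple-instance reformulation can be made concrete with the standard MIL assumption: a patient (the "bag") is flagged if any single smear image (an "instance") looks malignant. The max-pooling rule below is shown for illustration; the paper's own aggregation may differ, and the threshold is an assumption.

```python
import numpy as np

def patient_level_call(image_probs, threshold=0.5):
    """Max-pooling MIL aggregation: combine per-image ALL probabilities
    into one patient-level decision. Standard MIL assumption, used here
    only to illustrate why one image is not enough on its own."""
    bag_score = float(np.max(image_probs))
    return bag_score, bag_score >= threshold

# A patient whose images mostly look benign except one suspicious smear:
score, is_all = patient_level_call(np.array([0.10, 0.20, 0.92, 0.30]))
```

A single benign-looking image would have produced a false negative; aggregating over the whole bag of images is what makes the decision reliable.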
Your Out-of-Distribution Detection Method is Not Robust!
Out-of-distribution (OOD) detection has recently gained substantial attention
due to the importance of identifying out-of-domain samples in reliability and
safety. Although OOD detection methods have advanced greatly, they are
still susceptible to adversarial examples, which is a violation of their
purpose. To mitigate this issue, several defenses have recently been proposed.
Nevertheless, these efforts remained ineffective, as their evaluations are
based on either small perturbation sizes or weak attacks. In this work, we
re-examine these defenses against an end-to-end PGD attack on in/out data with
larger perturbation sizes, e.g. up to the perturbation budget commonly used for
the CIFAR-10 dataset. Surprisingly, almost all of these defenses perform worse than
a random detection under the adversarial setting. Next, we aim to provide a
robust OOD detection method. In an ideal defense, the training should expose
the model to almost all possible adversarial perturbations, which can be
achieved through adversarial training. That is, such training perturbations
should be based on both in- and out-of-distribution samples. Therefore, unlike OOD
detection in the standard setting, access to OOD, as well as in-distribution,
samples appears necessary in the adversarial training setup. These considerations lead us
to adopt generative OOD detection methods, such as OpenGAN, as a baseline. We
subsequently propose the Adversarially Trained Discriminator (ATD), which
utilizes a pre-trained robust model to extract robust features, and a generator
model to create OOD samples. Using ATD with CIFAR-10 and CIFAR-100 as the
in-distribution data, we could significantly outperform all previous methods in
the robust AUROC while maintaining high standard AUROC and classification
accuracy. The code repository is available at https://github.com/rohban-lab/ATD
Comment: Accepted to NeurIPS 202
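An end-to-end PGD attack on a detector score can be sketched in a few lines. The version below uses a linear-logistic detector so the input gradient has a closed form; it is a stand-in for the attack on real detectors described in the abstract, and the eps/step values are illustrative, not the paper's settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_on_detector(x, w, eps=0.03, step=0.01, iters=10, target_in=True):
    """PGD that pushes a sample across a linear detector's score
    score(x) = sigmoid(w . x). The analytic input gradient is
    sigmoid'(w.x) * w. target_in=True makes an OOD sample look
    in-distribution; False does the reverse."""
    x0, x_adv = x.copy(), x.copy()
    sign = 1.0 if target_in else -1.0
    for _ in range(iters):
        s = sigmoid(w @ x_adv)
        grad = s * (1.0 - s) * w                      # d score / d x
        x_adv = x_adv + sign * step * np.sign(grad)   # signed gradient step
        x_adv = x0 + np.clip(x_adv - x0, -eps, eps)   # project to eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)              # valid image range
    return x_adv

rng = np.random.default_rng(0)
w = rng.normal(size=8)
x = rng.uniform(0.2, 0.8, size=8)   # a toy "OOD" input
x_adv = pgd_on_detector(x, w)       # now scores higher as in-distribution
```

Evaluating a defense only against small eps or single-step attacks misses exactly this kind of iterated, projected attack, which is the abstract's point.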
Blacksmith: Fast Adversarial Training of Vision Transformers via a Mixture of Single-step and Multi-step Methods
Despite the remarkable success achieved by deep learning algorithms in
various domains, such as computer vision, they remain vulnerable to adversarial
perturbations. Adversarial Training (AT) stands out as one of the most
effective solutions to address this issue; however, single-step AT can lead to
Catastrophic Overfitting (CO). This scenario occurs when the adversarially
trained network suddenly loses robustness against multi-step attacks like
Projected Gradient Descent (PGD). Although several approaches have been
proposed to address this problem in Convolutional Neural Networks (CNNs), we
found out that they do not perform well when applied to Vision Transformers
(ViTs). In this paper, we propose Blacksmith, a novel training strategy to
overcome the CO problem, specifically in ViTs. Our approach randomly applies either
PGD-2 or the Fast Gradient Sign Method (FGSM) to each mini-batch during the
adversarial training of the neural network. This will increase the diversity of
our training attacks, which could potentially mitigate the CO issue. To manage
the increased training time resulting from this combination, we craft the PGD-2
attack based on only the first half of the layers, while FGSM is applied
end-to-end. Through our experiments, we demonstrate that our novel method
effectively prevents CO, achieves PGD-2 level performance, and outperforms
other existing techniques, including N-FGSM, the state-of-the-art method for
fast adversarial training of CNNs.
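The per-mini-batch mixing strategy can be sketched as below. This is an illustration of the random choice between the two attacks only; the gradients come from a caller-supplied function, the half-network PGD-2 trick is omitted, and eps/step/p_pgd are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)

def fgsm(x, grad_fn, eps):
    # Single signed-gradient step, applied end-to-end.
    return x + eps * np.sign(grad_fn(x))

def pgd2(x, grad_fn, eps, step):
    # Two signed-gradient steps, projected back into the eps-ball.
    x_adv = x.copy()
    for _ in range(2):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv))
        x_adv = x + np.clip(x_adv - x, -eps, eps)
    return x_adv

def blacksmith_batch(x, grad_fn, eps=0.03, step=0.015, p_pgd=0.5):
    """Pick PGD-2 or FGSM at random for each mini-batch, the mixing
    strategy the abstract describes (sketch; layer-splitting omitted)."""
    if rng.random() < p_pgd:
        return pgd2(x, grad_fn, eps, step)
    return fgsm(x, grad_fn, eps)

# Toy usage: a constant loss gradient, so both attacks just move within eps.
grad_fn = lambda z: np.ones_like(z)
x = np.zeros(4)
x_adv = blacksmith_batch(x, grad_fn)
```

Randomizing the attack per mini-batch diversifies the perturbations the network sees, which is the mechanism the paper credits for avoiding catastrophic overfitting.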